Add fast local tool search with hybrid mode by vl3c · Pull Request #47 · vl3c/MatHud

vl3c · 2026-02-21T14:32:42Z

Summary

Adds a fast local keyword/category search engine (~0.01ms per query) that eliminates API calls for tool discovery in the common case
Default mode is hybrid: uses local search first, falls back to gpt-5-nano API only when confidence is low
Switches API fallback model from gpt-4.1-mini to gpt-5-nano (8x cheaper input costs)
Includes LRU result cache with 5-minute TTL
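
For reviewers unfamiliar with the caching behaviour, here is a minimal sketch of an LRU cache with a TTL. The class name `TTLLRUCache`, its method names, and the injectable `clock` parameter are assumptions for illustration and are not taken from the PR; only the 5-minute TTL and the 100-entry limit come from the commit message.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Illustrative LRU cache whose entries also expire after a fixed TTL."""

    def __init__(self, max_entries=100, ttl_seconds=300.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = OrderedDict()  # key -> (expires_at, value), oldest first

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._data[key]
            return None
        # Mark as most recently used.
        self._data.move_to_end(key)
        return value

    def put(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        # Evict least recently used entries once over capacity.
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)
```

Keying such a cache on the normalized query string would let repeated prompts skip scoring entirely, though how the PR builds its cache keys is not stated here.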

Search accuracy (180-case benchmark)

Metric	Score
Top-1	91.7%
Top-3	100%
Top-5	100%
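
As a rough illustration of how top-k figures like these are usually computed, the sketch below counts a case as a hit when the expected tool appears among the first k ranked results. The function name `top_k_accuracy` and the sample cases are hypothetical; the PR's benchmark harness may compute the metric differently.

```python
def top_k_accuracy(cases, k):
    """Fraction of cases whose expected tool is within the first k results.

    `cases` is a list of (expected_tool, ranked_results) pairs.
    """
    hits = sum(1 for expected, ranked in cases if expected in ranked[:k])
    return hits / len(cases)


# Hypothetical cases, not from the PR's benchmark dataset.
example_cases = [
    ("create_circle", ["create_circle", "create_ellipse"]),
    ("undo", ["redo", "undo"]),
]
top1 = top_k_accuracy(example_cases, 1)
top3 = top_k_accuracy(example_cases, 3)
```

With 180 cases, a top-1 score of 91.7% is consistent with 165 correct first results (165 / 180 rounds to 91.7%), although the raw count is not given in the PR.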

New tests

17 mocked end-to-end prompt pipeline tests (real local search + real filtering, only OpenAI API mocked)
62 creative prompt tests (casual language, homework, physics, geometry, statistics, graph theory, workspace, canvas, transforms, edge cases)
Offline benchmark + latency tests (p99 < 5ms)
23 unit tests for local search, cache, mode switching, category registry
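
A latency check of the `p99 < 5ms` kind can be sketched as follows. This is an assumed shape, not the PR's test code: `percentile` uses the nearest-rank method, and `p99_latency_ms` times an arbitrary `search_fn` callable standing in for the local search.

```python
import time


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding float rounding issues.
    rank = max(1, -(-(pct * len(ordered)) // 100))
    return ordered[rank - 1]


def p99_latency_ms(search_fn, queries, repeats=20):
    """Time search_fn over the queries and return the p99 latency in ms."""
    samples = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            search_fn(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    return percentile(samples, 99)
```

A test would then assert something like `p99_latency_ms(local_search, benchmark_queries) < 5.0`, where both names are placeholders for the real search function and dataset.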

New tooling

scripts/compare_search_modes.py for side-by-side mode comparison with disagreement analysis

Configuration

Set TOOL_SEARCH_MODE env var: hybrid (default), local, or api
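
The mode selection can be pictured with the sketch below. Only the variable name `TOOL_SEARCH_MODE` and its three values come from the PR. The helper names, the fallback to `hybrid` on an unrecognized value, and the `min_confidence` threshold of 0.5 are all assumptions made for illustration.

```python
import os

VALID_MODES = ("hybrid", "local", "api")


def get_search_mode(env=None):
    """Read TOOL_SEARCH_MODE, defaulting to hybrid (fallback on bad values is assumed)."""
    env = os.environ if env is None else env
    mode = env.get("TOOL_SEARCH_MODE", "hybrid").strip().lower()
    return mode if mode in VALID_MODES else "hybrid"


def search(query, mode, local_search, api_search, min_confidence=0.5):
    """Dispatch a query according to the mode.

    local_search returns (results, confidence); api_search returns results.
    Both are caller-supplied callables, hypothetical stand-ins for the real service.
    """
    if mode == "api":
        return api_search(query)
    results, confidence = local_search(query)
    if mode == "hybrid" and confidence < min_confidence:
        # Low-confidence local result: fall back to the API.
        return api_search(query)
    return results
```

In this sketch `local` mode never calls `api_search`, which mirrors the PR's statement that local mode does not touch the network.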

Test plan

All server tests pass (0 failures)
17 prompt pipeline tests pass (streaming + non-streaming paths, filtering verification)
159 tool search tests pass (benchmark + creative + unit)
mypy clean on static/tool_search_service.py
Integration smoke test: start app with hybrid mode, send chat messages, verify tools are discovered correctly

🤖 Generated with Claude Code

Replace the API-only tool search with a fast local keyword/category search engine that eliminates API latency in the common case (~0.01ms vs 1-3s per query).

Architecture:
- 13 tool categories with keyword triggers and inverted indices built at module load time for O(1) token lookups
- Multi-signal scoring: category boost, name/description index match, exact name match, action-verb alignment, and 40+ intent-based disambiguation rules
- LRU result cache with 5-minute TTL (100 entries max)
- Lazy OpenAI client: local mode never touches the network

Search modes (TOOL_SEARCH_MODE env var):
- hybrid (default): local first, falls back to API when confidence is low
- local: keyword-only, no API call
- api: original GPT-based semantic search

Also switches API fallback model from gpt-4.1-mini to gpt-5-nano (8x cheaper input costs).

Accuracy on 180-case benchmark: 91.7% top-1, 100% top-3, 100% top-5.
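
To make the inverted-index idea concrete, here is a small sketch that builds a token-to-tools index once at import time and scores queries with two of the signals named in this commit (index match and category boost). The tool names, descriptions, categories, stopword list, and score weights are all hypothetical; the real service uses 13 categories and many more signals.

```python
import re
from collections import defaultdict

# Hypothetical tool registry for illustration only.
TOOLS = {
    "create_circle": "Create a circle from a center point and radius",
    "create_triangle": "Create a triangle from three points",
    "derivative": "Compute the derivative of a function",
}
TOOL_CATEGORY = {
    "create_circle": "geometry",
    "create_triangle": "geometry",
    "derivative": "calculus",
}
CATEGORY_KEYWORDS = {
    "geometry": {"circle", "triangle"},
    "calculus": {"derivative"},
}
STOPWORDS = {"a", "the", "of", "from", "and"}


def tokenize(text):
    """Lowercase alphanumeric tokens, minus stopwords; underscores split names."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def build_index(tools):
    """Map each token to the set of tools whose name or description contains it."""
    index = defaultdict(set)
    for name, description in tools.items():
        for token in tokenize(name + " " + description):
            index[token].add(name)
    return index


INDEX = build_index(TOOLS)  # built once at module load


def search_local(query, limit=5):
    scores = defaultdict(float)
    for token in tokenize(query):
        # Signal 1: name/description index match (dict lookup per token).
        for name in INDEX.get(token, ()):
            scores[name] += 1.0
        # Signal 2: category boost for every tool in a triggered category.
        for category, keywords in CATEGORY_KEYWORDS.items():
            if token in keywords:
                for name, tool_category in TOOL_CATEGORY.items():
                    if tool_category == category:
                        scores[name] += 0.5
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))[:limit]
```

Because the index is a plain dictionary, each query token costs one average-case constant-time lookup, which is what keeps per-query latency far below a network round trip.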

- test_tool_search_local.py: offline benchmark (180-case dataset), latency tests (p99 < 5ms), and 62 creative prompt tests covering casual language, homework scenarios, physics/engineering, geometry constructions, statistics, graph theory, workspace ops, canvas ops, transforms, ambiguous terms, and edge cases
- test_tool_search_service.py: 23 new unit tests for local search, cache, mode switching, category registry, and lazy client init; update existing API tests for mode-aware fixtures
- test_tool_discovery_live.py: add search_ms and search_mode columns to CSV output for latency tracking
- scripts/compare_search_modes.py: side-by-side comparison of search modes with disagreement analysis and CSV export

- Reference Manual: add ToolSearchService section with architecture, class methods, and environment variable documentation
- README.md: add TOOL_SEARCH_MODE to configuration example
- CLAUDE.md: add TOOL_SEARCH_MODE to .env configuration section
- Project Architecture: update tool count and add tool discovery line

17 tests verify the full pipeline: natural-language prompt → real local tool search → real filtering → correct tool calls returned to the client. Only OpenAI API calls are mocked; _intercept_search_tools and ToolSearchService.search_tools_local run for real with TOOL_SEARCH_MODE=local.

Streaming (14 tests): circle, triangle, derivative, solve, distribution, descriptive stats, graph, undo, save workspace, rotate, multi-tool, filtering of irrelevant tools, essential passthrough, no-search passthrough.

Non-streaming (3 tests): o3 reasoning model, gpt-4.1 chat completion, chat completion with irrelevant tool filtering.
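
The testing pattern described in this commit (mock only the model API, run search and filtering for real) might look roughly like the sketch below. Everything here is a simplified stand-in: `local_search`, `run_pipeline`, the `client.complete` method, and the tool names are hypothetical, and they do not reproduce the signatures of `_intercept_search_tools` or `ToolSearchService.search_tools_local`.

```python
from unittest.mock import MagicMock

# Hypothetical tool registry.
ALL_TOOLS = [{"name": "create_circle"}, {"name": "undo"}, {"name": "save_workspace"}]


def local_search(query):
    """Toy stand-in for the real local search: match the last word of each tool name."""
    q = query.lower()
    return [t["name"] for t in ALL_TOOLS if t["name"].split("_")[-1] in q]


def run_pipeline(prompt, client):
    """Search locally, send only matching tools to the model, filter its tool calls."""
    allowed = set(local_search(prompt))
    tools = [t for t in ALL_TOOLS if t["name"] in allowed]
    response = client.complete(prompt=prompt, tools=tools)
    return [call for call in response["tool_calls"] if call["name"] in allowed]


def test_circle_prompt():
    # Only the model client is mocked; search and filtering run for real.
    client = MagicMock()
    client.complete.return_value = {
        "tool_calls": [{"name": "create_circle"}, {"name": "undo"}]
    }
    calls = run_pipeline("draw a circle with radius 3", client)
    # The irrelevant "undo" call is filtered out.
    assert calls == [{"name": "create_circle"}]
    # The model was offered only the tools found by the local search.
    assert client.complete.call_args.kwargs["tools"] == [{"name": "create_circle"}]
```

Mocking at the API boundary keeps the tests offline and deterministic while still exercising the search and filtering code paths end to end.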

vl3c added 4 commits February 21, 2026 16:29

vl3c merged commit 526680d into main Feb 21, 2026
1 check passed

vl3c deleted the pr-46 branch February 21, 2026 15:55
